A Joint Learning Approach to Few-Shot Learning for Multi-category Sentiment Classification
LI Zicheng, CHANG Xiaoqin, LI Yameng, LI Shoushan, ZHOU Guodong
Acta Scientiarum Naturalium Universitatis Pekinensis    2023, 59 (1): 57-64.   DOI: 10.13209/j.0479-8023.2022.068
Most few-shot learning approaches fail to achieve satisfactory results on fine-grained, multi-category sentiment classification tasks. To address this problem, we propose a joint learning approach to few-shot multi-category sentiment classification. Specifically, we use a pre-trained replaced token detection model as the few-shot learner and reformulate fine-grained sentiment classification as both a classification problem and a regression problem by appending classification and regression templates, together with label description words, to the input. For joint learning, we propose several fusion methods that combine the classification prediction and the regression prediction. Experimental results show that, compared with mainstream few-shot methods, the proposed approach achieves clearly better F1 score and accuracy.
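The abstract does not say which fusion methods are used, so the following is only a minimal sketch of one plausible option. It assumes ordinal labels 0..K-1 (for example, star ratings), a classification head that outputs one probability per label, and a regression head that outputs a real-valued score on the same scale. The function name `fuse_predictions` and the `weight` parameter are hypothetical and do not come from the paper.

```python
from typing import Sequence


def fuse_predictions(class_probs: Sequence[float],
                     regression_score: float,
                     weight: float = 0.5) -> int:
    """Fuse a classification distribution with a regression score.

    Assumption (not from the paper): labels are ordinal integers
    0..K-1, class_probs[i] is the probability of label i, and
    regression_score lives on the same 0..K-1 scale.
    """
    if not class_probs:
        raise ValueError("class_probs must not be empty")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    total = sum(class_probs)
    if total <= 0:
        raise ValueError("class_probs must sum to a positive value")

    # Expected label under the (normalized) classification distribution.
    expected = sum(i * p for i, p in enumerate(class_probs)) / total
    # Weighted average of the two views of the same prediction.
    fused = weight * expected + (1.0 - weight) * regression_score
    # Clip to the valid label range, then round half up.
    clipped = min(max(fused, 0.0), len(class_probs) - 1)
    return int(clipped + 0.5)


# Classifier is certain of label 2, regressor says 4.0: the fused label is 3.
print(fuse_predictions([0.0, 0.0, 1.0, 0.0, 0.0], 4.0))
```

Setting `weight` to 1.0 or 0.0 recovers the pure classification or pure regression prediction, which makes it easy to compare the fused output against each head on its own.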
Document Constrained Translation Quality Estimation Model
FENG Qin, GONG Zhengxian, YE Heng, ZHOU Guodong
Acta Scientiarum Naturalium Universitatis Pekinensis    2023, 59 (1): 39-47.   DOI: 10.13209/j.0479-8023.2022.067
This paper proposes a new translation quality estimation model that scores the translation of each source-language sentence without relying on a reference translation. The authors model the sentence-level semantic difference and the word-level referential difference between the source and the translation, and design an additional loss function so that the model constrains these differences as much as possible when predicting scores. Experimental results show that the proposed method effectively improves the performance of the quality estimation model. Compared with the baseline system, it improves the Pearson correlation coefficient by up to 6.68 percentage points.
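The reported metric, the Pearson correlation coefficient, measures how closely predicted quality scores track human scores. The following is a minimal sketch of how it can be computed, assuming the two sets of scores are available as equal-length lists. The example scores are made up for illustration and are not data from the paper.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (the 1/n factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted vs. human quality scores for four sentences.
predicted = [0.31, 0.55, 0.72, 0.90]
human = [0.25, 0.60, 0.65, 0.95]
print(round(pearson(predicted, human), 4))
```

A gain of 6.68 percentage points corresponds to an increase of 0.0668 in this coefficient.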
Multimodal Emotion Recognition with Auxiliary Sentiment Information
WU Liangqing, LIU Qiyuan, ZHANG Dong, WANG Jiancheng, LI Shoushan, ZHOU Guodong
Acta Scientiarum Naturalium Universitatis Pekinensis    2020, 56 (1): 75-81.   DOI: 10.13209/j.0479-8023.2019.105
Unlike previous studies that use only text, this paper performs emotion recognition on multimodal data (text and audio). To address the characteristics of multimodal data, we propose a novel joint learning framework in which an auxiliary task (multimodal sentiment classification) helps the main task (multimodal emotion classification). First, private neural layers are designed for the text and audio modalities of the main task to learn the uni-modal independent dynamics. Second, with the shared neural layers from the auxiliary task, we obtain the uni-modal representations of the auxiliary task and the auxiliary representations of the main task. The uni-modal independent dynamics are then combined with the auxiliary representations for each modality to obtain the uni-modal representations of the main task. Finally, to capture multimodal interactive dynamics, we fuse the text and audio representations for the main and auxiliary tasks separately with a self-attention mechanism to obtain the final multimodal emotion and sentiment representations. Empirical results demonstrate the effectiveness of our approach on both the multimodal emotion classification task and the sentiment classification task.
Building Chinese Zero Corpus From Discourse Perspective
SHENG Chen, KONG Fang, ZHOU Guodong
Acta Scientiarum Naturalium Universitatis Pekinensis    2019, 55 (1): 15-21.   DOI: 10.13209/j.0479-8023.2018.057
To better handle Chinese zero elements, this paper presents a theoretical analysis from a discourse perspective and constructs the Chinese Discourse Zero Corpus (CDZC). First, the necessity of building the corpus is examined on the basis of existing theoretical work and data sources. Then, a top-down, forward-search annotation strategy combined with human-machine collaboration is used to annotate the corpus. Finally, detailed statistical analysis shows that CDZC reflects the characteristics of the Chinese language well and provides a corpus resource for related research.
Recognizing the Ellipsis of Opinion Target in Chinese Text
ZHU Zhu, WANG Rong, LI Shoushan, ZHOU Guodong
Acta Scientiarum Naturalium Universitatis Pekinensis   
A novel method is proposed to recognize the ellipsis of opinion targets in Chinese text. The approach treats opinion target ellipsis recognition as a binary classification problem and solves it with a machine learning algorithm. Three kinds of features, namely position-independent sentence features, position-dependent sentence features, and contextual features, are each applied to the recognition task. Experimental results in three domains demonstrate that the machine-learning-based method is effective for recognizing opinion target ellipsis.
Automatic Recognition and Classification on Chinese Discourse Connective
LI Yancui, SUN Jing, ZHOU Guodong
Acta Scientiarum Naturalium Universitatis Pekinensis   
Building on the annotation of discourse connectives in the Chinese Discourse Treebank, in particular the annotation of connectives and their relation classes, the authors extract syntactic, lexical, and positional features from both automatic and gold-standard syntax trees and use a supervised method to recognize and classify connectives. Experimental results show an F1-measure of 69.2% for connective recognition and an accuracy of 89.1% for connective classification.
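For readers unfamiliar with the recognition metric, the F1-measure is the harmonic mean of precision and recall. The sketch below shows the computation, assuming gold and predicted connectives are represented as sets of hypothetical `(sentence_id, token_index)` positions. That representation and the example data are illustrative assumptions, not the authors' format.

```python
from typing import Set, Tuple

# Hypothetical representation: (sentence_id, token_index) of a connective.
Position = Tuple[int, int]


def precision_recall_f1(gold: Set[Position],
                        predicted: Set[Position]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted vs. gold connectives."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: 4 gold connectives, 3 predictions, 2 of them correct.
gold = {(0, 1), (0, 5), (1, 2), (2, 0)}
predicted = {(0, 1), (1, 2), (3, 3)}
p, r, f1 = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```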
Recognition and Classification of Relation Words in the Compound Sentences Based on Tsinghua Chinese Treebank
LI Yancui, SUN Jing, ZHOU Guodong, FENG Wenhe
Acta Scientiarum Naturalium Universitatis Pekinensis   
Following the annotation scheme of the Tsinghua Chinese Treebank, the authors extract relation words and label their categories. Syntactic, lexical, and positional features are then extracted from automatic syntax trees with and without functional markers to recognize and classify relation words. Experimental results show a recognition accuracy of 95.7% and a classification F1 of 77.2% for relation words.
Research of Chinese Implicit Discourse Relation Recognition
SUN Jing, LI Yancui, ZHOU Guodong, FENG Wenhe
Acta Scientiarum Naturalium Universitatis Pekinensis   
The authors use a self-built Chinese Discourse Treebank, in which 80% of relations are implicit, to recognize implicit discourse relations. In this corpus, discourse relations are organized into three layers; the first layer has four types: causality, coordination, transition, and explanation. Based on this corpus, a maximum entropy classifier with contextual, lexical, and dependency parse features is used to identify the four relation types. Experimental results show an overall accuracy of 62.15%; coordination is identified best, with an F1 of 75.26%.
Pronoun Resolution Based on Deep Learning
XI Xuefeng, ZHOU Guodong
Acta Scientiarum Naturalium Universitatis Pekinensis   
Coreference resolution is a fundamental task in natural language processing. This paper proposes a coreference resolution system based on a deep learning model, the deep belief network (DBN), to detect and classify coreference relationships between anaphors and antecedents. The DBN is a classifier that combines several unsupervised restricted Boltzmann machine (RBM) layers with a supervised back-propagation (BP) layer. The RBM layers preserve as much information as possible when feature vectors are passed to the next layer, and the BP layer is trained to classify the features generated by the last RBM layer. Experiments are conducted on the ACE 2004 English NWIRE corpus and the ACE 2005 Chinese NWIRE corpus. The results show that increasing the number of RBM training layers and adding an abstract layer for the feature set both improve the performance of the coreference resolution system.
Research of Chinese Clause Identification Based on Comma
LI Yancui, FENG Wenhe, ZHOU Guodong, ZHU Kunhua
Acta Scientiarum Naturalium Universitatis Pekinensis   
Drawing on the requirements and practice of Chinese discourse analysis and on traditional studies, the authors propose the clause as the basic discourse unit and define it in terms of structure, function, and form. They analyze the relationship between commas and clauses and study comma-based clause identification on an annotated corpus. The corpus, extracted from CTB6.0, labels whether each comma can be regarded as a clause boundary and contains 2171 commas in 1348 sentences. Using syntactic, lexical, and length features, clause identification accuracy reaches 90%. The nine most informative features, selected by information gain, also yield high clause identification accuracy. Finally, using only morphological features, accuracy reaches 84.5%. The experiments show that the definition of the clause is reasonable and that comma-based clause identification is feasible.
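Information gain, the criterion used above to select features, is the reduction in label entropy obtained by splitting the data on a feature. The sketch below computes it for discrete features and ranks them. The feature names and toy data are hypothetical and serve only to illustrate the calculation; the abstract does not describe the authors' actual features in this detail.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a non-empty label sequence."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(feature_values: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """IG = H(labels) - sum over v of P(feature=v) * H(labels | feature=v)."""
    if len(feature_values) != len(labels) or not labels:
        raise ValueError("need two equal-length, non-empty sequences")
    groups: Dict[Hashable, List[Hashable]] = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(group) / total * entropy(group)
                      for group in groups.values())
    return entropy(labels) - conditional


# Toy data: is each comma a clause boundary ("B") or not ("N")?
labels = ["B", "B", "N", "N"]
features = {
    "next_word_is_verb": [1, 1, 0, 0],   # hypothetical, perfectly predictive
    "left_span_is_long": [1, 0, 1, 0],   # hypothetical, uninformative
}
ranked = sorted(features,
                key=lambda name: information_gain(features[name], labels),
                reverse=True)
print(ranked)
```

Keeping the first nine names of such a ranking would give a top-nine feature subset of the kind described in the abstract.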